Understanding Global Dynamics: Population,
Health Systems, Gender Equality
Teammates:
1. Jaynica Nunna 11697960
2. Vitesh Chalicheemala 11689328
3. Mounika Vankayalapati 11714417
4. Supriya Ravilla 11644767
Introduction
Our project focuses on understanding crucial aspects of global
dynamics, including population trends, healthcare systems, and
gender equality.
By analyzing and visualizing data related to these domains, we aim to
uncover insights that contribute to informed decision-making and
drive positive societal change.
Through this endeavor, we seek to provide valuable insights into the
complex interplay of factors shaping our world today.
Data Abstraction
Dataset:
We used three datasets from reputable sources such as Kaggle, one of which was chosen from the ten
datasets offered as project options.
These datasets encompassed a wide range of attributes, including population demographics, healthcare
infrastructure, innovation indices, and gender inequality metrics.
Number of Records:
The combined datasets contain well over a thousand records in aggregate, with each record representing
a specific country or region.
The exact number of records varies from dataset to dataset.
Data Transformation:
Before analysis and visualization, we performed extensive data preprocessing to clean and harmonize
the datasets. This involved tasks such as handling missing values, standardizing units of measurement,
and ensuring data consistency across different sources.
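The harmonization steps described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the country names, column names, values, and the `harmonize` helper are all hypothetical, and the unit conversion (physicians per 10,000 to physicians per 1,000) is just one example of standardizing units.

```python
import pandas as pd

# Hypothetical raw extracts; names and values are illustrative only.
population = pd.DataFrame({
    "Country": [" northland", "Southland ", "eastland"],
    "Population": ["12,500", "8,300", "950"],  # thousands, stored as text
})
health = pd.DataFrame({
    "country": ["Northland", "Southland", "Eastland"],
    "physicians_per_10000": [7.0, 24.0, 36.0],
})

def harmonize(df: pd.DataFrame) -> pd.DataFrame:
    """Lower-case the column names and normalize the country key."""
    out = df.rename(columns=str.lower)
    out["country"] = out["country"].str.strip().str.title()
    return out

population = harmonize(population)
health = harmonize(health)

# Parse numbers that were stored as text with thousands separators.
population["population_thousands"] = pd.to_numeric(
    population.pop("population").str.replace(",", "", regex=False)
)

# Standardize units: physicians per 10,000 -> physicians per 1,000.
health["physicians_per_1000"] = health.pop("physicians_per_10000") / 10

# validate= raises an error if a country appears more than once on either side.
merged = population.merge(health, on="country", how="inner", validate="one_to_one")
```

Normalizing the join key before merging matters because an inner join silently drops rows whose country names differ only in case or whitespace.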
Workflow Explanation
Data Collection:
Getting relevant datasets from Kaggle, a popular platform for data science datasets.
Initial Visualization:
Utilizing D3.js to visualize the uncleaned dataset, providing an initial overview of the data's structure and
potential insights.
Data Cleaning:
Using Python programming language and libraries like Pandas and NumPy to clean the dataset, handling
missing values, outliers, and inconsistencies.
Refined Visualization:
Using Python's data visualization libraries such as Matplotlib or Seaborn to create more refined visualizations
based on the cleaned dataset, focusing on specific variables of interest.
Dashboard Creation:
Utilizing Microsoft Power BI to design interactive dashboards and reports, incorporating the refined
visualizations to provide a comprehensive view of the analyzed data.
Report Generation:
Creating detailed reports summarizing the analysis findings, insights, and conclusions drawn from the
visualizations and data analysis process.
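As a rough illustration of the Data Cleaning step in this workflow, the sketch below handles missing values, outliers, and duplicate rows with Pandas and NumPy. The column name `physicians_per_1000`, the sample values, and the choice of median imputation and the 1.5 x IQR rule are assumptions made for the example; the slides do not state which methods were actually used.

```python
import numpy as np
import pandas as pd

# Illustrative values only; the column name is hypothetical.
df = pd.DataFrame({
    "country": ["A", "B", "C", "D", "E", "F"],
    "physicians_per_1000": [1.0, 2.0, np.nan, 3.0, 4.0, 250.0],
})

col = "physicians_per_1000"

# 1. Missing values: fill with the column median, which is robust to outliers.
df[col] = df[col].fillna(df[col].median())

# 2. Outliers: flag values more than 1.5 * IQR beyond the quartiles.
q1, q3 = df[col].quantile([0.25, 0.75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
df["is_outlier"] = ~df[col].between(lower, upper)

# 3. Inconsistencies: keep only the first row for each country.
df = df.drop_duplicates(subset="country", keep="first")
```

Flagging outliers in a separate column, rather than deleting them, leaves the decision about each suspicious value to a later review.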
Task Abstraction
Task:
Target:
To analyze and visualize key aspects of global dynamics, including population demographics, healthcare
systems, and gender equality metrics.
Actions:
Data Collection: Gather relevant datasets from Kaggle, focusing on population demographics, healthcare
infrastructure, innovation indices, and gender inequality metrics.
Initial Visualization: Utilize D3.js to create preliminary visualizations of the uncleaned dataset, providing an
initial overview of its structure and potential insights.
Data Cleaning: Employ Python programming language and libraries like Pandas and NumPy to clean the
dataset, handling missing values, outliers, and inconsistencies.
Refined Visualization: Use Python's data visualization libraries such as Matplotlib or Seaborn to generate
more polished visualizations based on the cleaned dataset, highlighting specific variables of interest.
Dashboard Creation: Utilize Microsoft Power BI to design interactive dashboards and reports, integrating the
refined visualizations to offer a comprehensive view of the analyzed data.
Report Generation: Create detailed reports summarizing analysis findings, insights, and conclusions drawn
from the visualizations and data analysis process, facilitating informed decision-making.
Implementation using Tools:
D3.js:
Description: D3.js (Data-Driven Documents) was utilized for the initial visualization of the uncleaned dataset. It
provided a platform for creating dynamic and interactive charts and graphs directly within web browsers.
Usage: Through D3.js, we generated interactive visualizations such as bar charts, scatter plots, and heatmaps to
explore the structure and patterns within the raw dataset.
Python:
Description: Python, along with libraries like Pandas, NumPy, Matplotlib, and Seaborn, played a crucial role in data
preprocessing, analysis, and visualization.
Usage: Pandas and NumPy were employed for data cleaning tasks, including handling missing values, outliers, and
inconsistencies. Matplotlib and Seaborn were used to create refined visualizations based on the cleaned dataset,
showcasing insights through various types of charts and plots.
Microsoft Power BI:
Description: Microsoft Power BI served as the platform for designing interactive dashboards and reports,
integrating visualizations to provide a comprehensive view of the analyzed data.
Usage: Power BI enabled us to create interactive visualizations, including bar charts, line graphs, and maps, and
seamlessly integrate them into interactive dashboards. Additionally, Power BI's data modeling capabilities allowed for
the creation of relationships between different datasets, enhancing the depth of analysis in the final reports.
Results for Analysis
Using D3
This visualization shows the populations of the top ten countries in 2023.
Each bar's height corresponds to the population of one country.
Comparing bar heights shows which countries have larger populations
relative to the others.
The data used here has not been preprocessed or cleaned.
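The chart itself was built with D3, but the selection of the ten rows behind such a bar chart can be sketched in Python. The `top_n` helper, the column names, and the twelve placeholder countries with made-up populations are assumptions for illustration only.

```python
import pandas as pd

def top_n(df: pd.DataFrame, value_col: str, n: int = 10) -> pd.DataFrame:
    """Return the n rows with the largest `value_col`, largest first."""
    return df.nlargest(n, value_col).reset_index(drop=True)

# Placeholder data: twelve hypothetical countries with made-up populations.
data = pd.DataFrame({
    "country": [f"Country {i}" for i in range(1, 13)],
    "population": [5, 120, 33, 78, 9, 250, 61, 14, 180, 42, 7, 95],
})

top10 = top_n(data, "population")
```

Because the helper takes the value column as a parameter, the same function would work for any other ranked bar chart, such as one ordered by an index score.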
Using D3
Here we visualize the Global Innovation Index (GII) of the top ten countries.
Each bar represents a country, and the height of the bar indicates its GII score.
This visualization lets us compare innovation levels across countries,
with taller bars indicating higher innovation scores.
The data used here has not been preprocessed or cleaned.
Data Pre-processing
Using Python
After preprocessing, we saved all three datasets to CSV files
using Pandas.
We then created the visuals below using Python
visualization libraries.
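Saving the cleaned datasets to CSV with Pandas might look like the sketch below. The three frame names, their columns, the `_clean.csv` file-name suffix, and the temporary output folder are all hypothetical; the slides do not give the real names.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical cleaned frames; the real column names are not given in the slides.
cleaned = {
    "population": pd.DataFrame({"country": ["A", "B"], "population": [10, 20]}),
    "health": pd.DataFrame({"country": ["A", "B"], "physicians_per_1000": [1.5, 2.5]}),
    "gender_inequality": pd.DataFrame(
        {"country": ["A", "B"], "gender_inequality_index": [0.2, 0.4]}
    ),
}

out_dir = Path(tempfile.mkdtemp())  # stand-in for the project's output folder

for name, frame in cleaned.items():
    # index=False keeps the Pandas row index out of the file.
    frame.to_csv(out_dir / f"{name}_clean.csv", index=False)

# Read one file back to confirm the round trip.
reloaded = pd.read_csv(out_dir / "health_clean.csv")
```

Writing with `index=False` avoids an extra unnamed column appearing when the files are loaded again for plotting or imported into another tool.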
Relationship between
physicians and birth
registration
For healthcare systems, each point in this scatter plot represents
one nation.
The physicians-per-1,000-people axis indicates access to healthcare.
The completeness of birth registration reflects governance and care,
and signals a nation's commitment to its citizens' well-being.
The same visual, built with D3 before data preprocessing, is shown
on top.
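The relationship that the scatter plot shows visually could also be summarized with a correlation coefficient. The sketch below is an assumption about how that might be done, not part of the original analysis; the column names and values are made up.

```python
import pandas as pd

# Made-up values for illustration; the column names are hypothetical.
df = pd.DataFrame({
    "physicians_per_1000": [0.2, 0.8, 1.5, 2.6, 3.9],
    "birth_registration_pct": [45.0, 70.0, 88.0, 97.0, 100.0],
})

x = df["physicians_per_1000"]
y = df["birth_registration_pct"]

# Pearson correlation measures linear association.
pearson = x.corr(y)

# Spearman correlation is the Pearson correlation of the ranks,
# so it captures any monotonic relationship, linear or not.
spearman = x.rank().corr(y.rank())
```

Reporting both is useful here because birth registration is capped at 100%, so the relationship can be strongly monotonic without being linear.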
Population health
heatmap
This world map shows where people live.
Each color encodes population density,
giving a quick view of where people are
concentrated, from dense urban regions
to sparsely populated areas.
Gender inequality index
map
This map provides a straightforward
view of gender inequality across the
world.
Darker colors indicate higher levels
of inequality, while lighter shades
represent areas with less disparity
between genders.
It's a clear snapshot of the global
landscape of gender equality efforts.
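The "darker means more unequal" encoding can be sketched as a binning step that assigns each country a shade before the map is drawn. The 0 to 1 index scale, the four equal-width classes, the hex colors, and the column names are assumptions for illustration; the slides do not say how the map's color scale was defined.

```python
import pandas as pd

# Made-up index values on a 0-1 scale; the column names are hypothetical.
df = pd.DataFrame({
    "country": ["A", "B", "C", "D"],
    "gender_inequality_index": [0.05, 0.30, 0.55, 0.80],
})

# Sequential palette ordered light -> dark (hex codes chosen for illustration).
shades = ["#fee5d9", "#fcae91", "#fb6a4a", "#cb181d"]

# Four equal-width classes over [0, 1]; a higher class gets a darker shade.
df["shade"] = pd.cut(
    df["gender_inequality_index"],
    bins=[0.0, 0.25, 0.5, 0.75, 1.0],
    labels=shades,
    include_lowest=True,
).astype(str)
```

A small number of discrete classes makes the map legend easier to read than a continuous gradient, at the cost of hiding differences within a class.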
Next, we present additional
visual insights from
Microsoft Power BI by
embedding the report
here.
Microsoft Power BI
Work Management: Implementation Status Report
Work Completed:
Data Collection:
Description: Gathered relevant datasets from Kaggle.
Responsibility: All team members contributed to identifying and selecting datasets.
Contributions: Equal contribution from all team members.
Initial Visualization (D3.js):
Description: Created preliminary visualizations of the uncleaned dataset using D3.js.
Responsibility: Vitesh Chalicheemala
Contributions: Vitesh contributed 100% to this task.
Data Cleaning (Python):
Description: Cleaned the dataset using Python, Pandas, and NumPy.
Responsibility: Jaynica Nunna, Mounika Vankayalapati
Contributions: Jaynica and Mounika each contributed 50% to this task.
Refined Visualization (Python):
Description: Created refined visualizations based on the cleaned dataset using Python's data visualization libraries.
Responsibility: Supriya Ravilla
Contributions: Supriya contributed 100% to this task.
Dashboard Creation (Microsoft Power BI):
Description: Designed interactive dashboards and reports using Microsoft
Power BI.
Responsibility: All team members collaborated on dashboard design and
implementation.
Contributions: Equal contribution from all team members.
Report Generation:
Description: Created detailed reports summarizing analysis findings.
Responsibility: Jaynica Nunna
Contributions: Jaynica contributed 100% to this task.
Overall Contributions
Jaynica Nunna: 25%
Vitesh Chalicheemala: 25%
Mounika Vankayalapati: 25%
Supriya Ravilla: 25%
References
1. Kaggle. (n.d.). Kaggle: Your Home for Data Science. Retrieved from
https://www.kaggle.com/
2. D3.js. (n.d.). D3.js - Data-Driven Documents. Retrieved from
https://d3js.org/
3. Python Software Foundation. (n.d.). Python. Retrieved from
https://www.python.org/
4. McKinney, W., Perktold, J., & Seabold, S. (2020). pandas-dev/pandas:
Pandas 1.2.4. Zenodo. https://doi.org/10.5281/zenodo.4611042
5. Hunter, J. D. (2007). Matplotlib: A 2D Graphics Environment. Computing in
Science & Engineering, 9(3), 90-95. https://doi.org/10.1109/MCSE.2007.55
6. Microsoft Corporation. (n.d.). Power BI. Retrieved from
https://powerbi.microsoft.com/